Single-Threaded Mode AVF Prediction During Redundant Execution
Abstract
Transient faults can lead to serious errors in execution. Providing protection for the processor core against these faults requires redundant execution, which leads to a performance loss. However, not all bit flips have equal impact on the processor. The Architectural Vulnerability Factor (AVF) quantifies when a soft error is likely to alter the final output and when it has little impact due to the effects of masking. Thus, redundancy is only important during periods of high AVF. Although calculating the AVF typically requires post-execution analysis of the microarchitectural behavior of a program, recent work has shown it can be estimated online. However, redundant execution changes the bits that flow through the processor, exposing bottlenecks that single-threaded execution may not display and slowing overall execution to an unpredictable degree. This variability complicates estimation of the single-threaded AVF during redundant execution, making it difficult to decide when protection is unnecessary due to low vulnerability. To leverage these low AVF periods without leaving the processor vulnerable to transient faults, we need a way to track the single-threaded AVF even when protection is enabled. Our solution is to investigate the predictability of the single-threaded AVF during redundant execution and develop predictors for the underlying AVF of three processor structures. We then evaluate these predictors in a partial RMT implementation using intelligent toggling with a sample reliability policy.
Similar resources
Relaxed Determinism: Making Redundant Execution on Multiprocessors Practical
Given that the majority of future processors will contain an abundance of execution cores, redundant execution can offer a promising method for increasing the availability and resilience against intrusions of computing systems. However, redundant execution systems rely on the premise that when external input is duplicated identically to a set of replicas executing the same program, the replicas...
A Race-Detection and Flipping Algorithm for Automated Testing of Multi-threaded Programs
Testing concurrent programs that accept data inputs is notoriously hard because, besides the large number of possible data inputs, nondeterminism results in an exponentially large number of interleavings of concurrent events. In order to efficiently test shared-memory multithreaded programs, we develop an algorithm based on race-detection and flipping and illustrate how it can be combined with ...
Highly-Decoupled Thread Level Redundancy
Continued scaling of device dimensions and operating voltage reduces the critical charge and thus natural noise tolerance level of transistors. As a result, circuits can produce transient upsets that corrupt program execution and data. To prevent operational failure due to these errors, system-level techniques such as redundant execution will increasingly be required for fault detection and tol...
POWER5 system microarchitecture
B. Sinharoy, R. N. Kalla, J. M. Tendler, R. J. Eickemeyer, J. B. Joyner. This paper describes the implementation of the IBM POWER5 chip, a two-way simultaneous multithreaded dual-core chip, and systems based on it. With a key goal of maintaining both binary and structural compatibility with POWER4 systems, the POWER5 microprocessor allows system scalability to 64 physical process...
JESSICA2: A Distributed Java Virtual Machine with Transparent Thread Migration Support
A distributed Java Virtual Machine (DJVM) spanning multiple cluster nodes can provide a true parallel execution environment for multi-threaded Java applications. Most existing DJVMs suffer from the slow Java execution in interpretive mode and thus may not be efficient enough for solving computation-intensive problems. We present JESSICA2, a new DJVM running in JIT compilation mode that can exec...